[57] ABSTRACT

A system for matching images in which characteristic
points of an image to be tested for a match, such as a
fingerprint, are compared with characteristic points of a
master image by attempting to match the distances between
pairs of master characteristic points with distances
between pairs of live characteristic points, whereby the
coordinate system of the test image is not required to be
aligned with the coordinate system of the master image.
The matching system can be implemented in an
identification mode, in which the live image is attempted
to be matched with each of a number of master images, or a
verification mode, in which the live image is attempted to
be matched with a master image that is purported to be the
same as the live image.
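The alignment-free property stated in the abstract can be illustrated with a minimal sketch. The function names, the sample coordinates, and the choice to sort each spectrum in ascending order are assumptions made for illustration; they are not taken from the patent text.

```python
import math


def distance_spectra(points):
    """For each point, list its distances to every other point in the set."""
    spectra = []
    for i, (xi, yi) in enumerate(points):
        spectrum = sorted(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        spectra.append(spectrum)
    return spectra


def rotate_and_shift(points, angle, dx, dy):
    """Rigidly move a point set (hypothetical helper to simulate misalignment)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]


# Hypothetical characteristic points; the live set is the master set
# rotated by 30 degrees and translated.
master = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
live = rotate_and_shift(master, math.radians(30), 10.0, -2.0)

master_spectra = distance_spectra(master)
live_spectra = distance_spectra(live)
```

Because distances between pairs of points are unchanged by rotation and translation, the two sets of spectra agree to within rounding error even though the coordinate systems differ.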
If the difference between the compared distance values
satisfies the specified tolerance, matchcount is incremented
by one and the next master distance value is attempted to be
matched with one of the remaining live distance values. If
the difference value dspec is greater than spec-epsilon for a
particular live distance value, the next largest live distance
value is attempted to be compared with the same master
distance value. This evaluation continues until all distance
values in the master distance spectrum have been attempted to
be matched, or the distance values in the live distance
spectrum have been exhausted, or it is not possible to match
further live distance values with the remaining master
distance values.

At the end of the operation of SPEC-COMPARE, the number of
distance values in the master distance spectrum that have been
matched to distance values in the live distance spectrum,
i.e., matchcount, is returned to the computational loop of the
ANALYZE-SPECTRUM subroutine and stored in spectrum-match (see
FIG. 8). SPEC-COMPARE will be called once by ANALYZE-SPECTRUM
for each pair combination of a master distance spectrum and a
live distance spectrum, for a total of mastercount × livecount
times. Thus, spectrum-match will have (mastercount ×
livecount) entries, where each entry contains the number of
distance values that were found by SPEC-COMPARE to be matched
between a unique pairing of a master distance spectrum and a
live distance spectrum. link-addr, the index to
spectrum-match, will also have (mastercount × livecount)
entries which list live distance spectrum row numbers, one
through livecount repeated mastercount times. It is these
arrays, spectrum-match and link-addr, that are finally
returned by ANALYZE-SPECTRUM to the IMAGE-VERIFY program.

The final evaluation of the data representing the correlation
between distance values of the master and live distance
spectra is performed in the IMAGE-MATCH subroutine, which is
called once near the end of the IMAGE-VERIFY program following
the execution of the ANALYZE-SPECTRUM subroutine (see FIG.
5B). IMAGE-MATCH receives the spectrum-match and the link-addr
arrays and returns the Boolean variable is-a-match, which
indicates whether an overall match exists. The IMAGE-MATCH
subroutine will now be described in relation to FIGS. 10A-10B.

spectrum-match, link-addr, mastercount and livecount are
defined the same as in the ANALYZE-SPECTRUM subroutine.

spec-match-threshold is a preselected number representing the
minimum proportion of distance values in a master distance
spectrum that must have found matches in a live distance
spectrum in order to consider the master distance spectrum as
being matched to the associated live distance spectrum. This
value is arbitrarily set at 67%, although a larger or smaller
number may be chosen in accordance with the desired stringency
of the matching system.

accept-spectrum is the minimum number of distance values in a
master distance spectrum that must have found matches with
distance values from a live distance spectrum for the master
distance spectrum to be deemed as matching that live distance
spectrum.

minimum-match % is a preselected number representing the
minimum proportion of the distance values of all master
distance spectra that must be found in a matched master
distance spectrum and matched with a distance value in the
associated live distance spectrum in order for an overall
match to be indicated by the system. This value is set at 63%,
although this value also may be chosen in accordance with the
desired stringency of the matching system.

sigma-matches is the running total of the number of matched
distance values in the matched master distance spectra.

select is a linear array having n elements which will contain,
for each live distance spectrum, the value of the greatest
number of matches between the distance values in that live
distance spectrum and distance values in the various master
distance spectra.

total-elements is equal to the total number of possible
pairings of master minutiae and live minutiae.

Turning to the portion of the IMAGE-MATCH flowchart shown in
FIG. 10A, the loop at the bottom of FIG. 10A searches through
spectrum-match to find for each live distance spectrum the
master distance spectrum that matches it the best. This is
done by finding the master distance spectrum with which the
live distance spectrum has the most matched distance values.
link-addr provides the index to spectrum-match to keep track
of the live distance spectrum with which each location of
spectrum-match is associated. When the closest matching master
distance spectrum is found for a particular live distance
spectrum, the number of matches between distance values of the
two spectra is stored for that live distance spectrum in the
select array. After all of the values in spectrum-match have
been evaluated (i.e., once the index i has run through
total-elements), the final match evaluation is performed, as
illustrated in FIG. 10B.

The loop in the programming shown in FIG. 10B checks the
number of the most distance matches for each live distance
spectrum, and if that number is equal to or greater than
accept-spectrum, that number of distance matches is
accumulated in sigma-matches. After the loop evaluates all
live distance spectra, sigma-matches will equal the number of
matched distance values in the master distance spectra that
are deemed to be matched with live distance spectra. match %
is then formed as the ratio of the number in sigma-matches to
the number in total-elements. In the present embodiment, if
the resulting proportion exceeds 63%, i.e., minimum-match %,
then the live fingerprint image being tested is deemed
matched. This is indicated in IMAGE-MATCH by returning a
"true" value to the IMAGE-VERIFY program in is-a-match. If
match % does not exceed minimum-match %, the IMAGE-MATCH
subroutine returns a "false" value to the IMAGE-VERIFY program
in is-a-match.

Alternative standards for evaluating a match also may be
utilized with the invention. For example, a final match of the
live fingerprint to the master fingerprint could be defined to
exist where the total number of matched master and live
distance values exceeds a certain proportion of the maximum
possible number, or where the proportion of matched master
distance spectra exceeds a certain proportion of the total
number of master distance spectra. In fact, the evaluation
standard described in detail above is a hybrid of these two
alternatives.

While there has been illustrated and described what is at
present considered to be a preferred embodiment of the present
invention, it will be understood by those skilled in the art
that various changes and modifications may be made, and
equivalents may be substituted for elements thereof without
departing from the true scope of the invention. In addition,
many modifications may be made to adapt the teachings of the
invention to a particular situation without departing from the
central scope of the invention. Therefore, it is intended that
this invention not be limited to the particular embodiment
disclosed as the best mode contemplated for carrying out this
invention, but that the invention will include all embodiments
falling within the scope of the appended claims.
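The spectrum comparison and final evaluation described above can be sketched as follows. This is a minimal sketch rather than the flowcharted routines: it assumes each spectrum is sorted in ascending order, treats total-elements as mastercount × livecount, and takes accept_spectrum as a caller-supplied number. The Python names mirror the variables in the text, but the function signatures are hypothetical.

```python
def spec_compare(master_spec, live_spec, spec_epsilon):
    """Count master distance values matched to distinct live distance values."""
    matchcount = 0
    i = j = 0
    while i < len(master_spec) and j < len(live_spec):
        diff = live_spec[j] - master_spec[i]
        if abs(diff) <= spec_epsilon:
            matchcount += 1   # within tolerance: consume both values
            i += 1
            j += 1
        elif diff < 0:
            j += 1            # live value too small: try the next largest
        else:
            i += 1            # no remaining live value can match this one
    return matchcount


def analyze_spectrum(master_spectra, live_spectra, spec_epsilon):
    """Build spectrum_match and link_addr for every master/live pairing."""
    spectrum_match, link_addr = [], []
    for master_spec in master_spectra:
        for row, live_spec in enumerate(live_spectra, start=1):
            spectrum_match.append(
                spec_compare(master_spec, live_spec, spec_epsilon))
            link_addr.append(row)   # live row numbers 1..livecount, repeated
    return spectrum_match, link_addr


def image_match(spectrum_match, link_addr, mastercount, livecount,
                accept_spectrum, minimum_match=0.63):
    """Return True when the accumulated match proportion exceeds the minimum."""
    select = [0] * livecount
    for count, row in zip(spectrum_match, link_addr):
        select[row - 1] = max(select[row - 1], count)   # best per live spectrum
    sigma_matches = sum(c for c in select if c >= accept_spectrum)
    total_elements = mastercount * livecount            # assumption, see lead-in
    return sigma_matches / total_elements > minimum_match
```

A call such as `image_match(*analyze_spectrum(master, live, 0.1), len(master), len(live), accept_spectrum=2)` then yields the is-a-match result for two sets of spectra.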
We claim:

1. A method of matching a live image and a master image
having any random or nonrandom distribution of characteristic
features throughout an image, comprising the steps of:

generating a set of points representative of the
characteristic features of the live image;

generating a set of points representative of the
characteristic features of the master image; and

evaluating the match between the live characteristic points
and the master characteristic points, said step of
evaluating consisting essentially of the further steps of:

deriving the distances between a plurality of pairs of the
live characteristic points;

deriving the distances between a plurality of pairs of the
master characteristic points; and

comparing the live distances with the master distances to
determine whether or not the live image matches the master
image;

wherein the steps of deriving the distances between pairs of
the live characteristic points and deriving the distances
between pairs of the master characteristic points comprise
for each such step the step of forming, for each point in
the set of points, a spectrum of values representing the
distances between the point and each other point in the set
of points.

2. A method according to claim 1 further comprising the
initial steps of:

providing a set of master images; and

receiving an identification of one of the set of master
images for matching with the live image.

3. A method according to claim 1 wherein the step of
comparing comprises comparing each of the master distance
spectra to live distance spectra by determining, for each such
comparison of a master distance spectrum to a live distance
spectrum, the distance values in the master distance spectrum
that match separate distance values in the live distance
spectrum within a predetermined tolerance.

4. A method according to claim 3 wherein the step of
comparing further comprises indicating a match between the
live image and the master image if at least a predetermined
proportion of master distance spectra are matched with
separate live distance spectra, where a master distance
spectrum is deemed to match a live distance spectrum if the
proportion of the distance values in the master distance
spectrum that match distance values in the live distance
spectrum exceeds a predetermined value.

5. A method of identifying a person's identity consisting
essentially of the steps of:

receiving an identification of selected live minutiae
appearing in at least a portion of a selected fingerprint of
the person, where the minutiae are identified by their
spatial coordinates relative to a reference coordinate
system and may be randomly or nonrandomly distributed
throughout said fingerprint portion;

deriving the distances between a plurality of pairs of the
live minutiae;

providing values for the distances between a plurality of
pairs of selected minutiae appearing in at least a portion
of a master fingerprint wherein the minutiae may be randomly
or nonrandomly distributed throughout said fingerprint
portion; and

comparing the live distances with the master distances to
determine whether or not the live fingerprint matches the
master fingerprint;

wherein the steps of deriving the distances between pairs of
live minutiae and providing the distances between pairs of
master minutiae comprise for each such step the step of
forming, for each minutia, a spectrum of values representing
the distances between the minutia and each other minutia in
its set of live or master minutiae.

6. A method according to claim 5 further comprising the
steps of:

providing a collection of sets of master minutiae appearing
in at least a portion of their associated master
fingerprints; and

receiving an identification of one of the sets of master
minutiae and deriving the distances between a plurality of
pairs of minutiae in that set of master minutiae.

7. A method according to claim 5 wherein the step of
comparing further comprises indicating a match for the live
fingerprint if at least a predetermined proportion of master
distance spectra are matched with live distance spectra, where
a master distance spectrum is deemed to match a live distance
spectrum if a predetermined proportion of the distance values
in the master distance spectrum match distance values in the
live distance spectrum within a predetermined tolerance.

8. A method according to claim 5 further comprising the step
of generating an indication of a match if the live fingerprint
is successfully matched with the master fingerprint, or
alternatively generating an indication of the absence of a
match if the live fingerprint is unsuccessfully matched with
the master fingerprint.

9. An apparatus for matching a live image and a master image
having any random distribution of characteristic features
throughout an image, comprising:

a first means for receiving and storing a set of points
representative of the characteristic features of the live
image;

a second means for receiving and storing a set of points
representative of the characteristic features of the master
image;

computing means for receiving the live and the master
characteristic points from the first and second means,
deriving the distances between a plurality of pairs of the
live characteristic points and deriving the distances
between a plurality of pairs of the master characteristic
points, wherein the computing means derives, for each
characteristic point in each of the sets of live and master
characteristic points, a spectrum of values representing the
distances between the characteristic point and each other
characteristic point in its set of characteristic points;
and

means for comparing the live distances with the master
distances and initiating a predetermined activity
essentially only on the basis of a match between the live
distances and the master distances within a predetermined
tolerance.

10. An apparatus according to claim 9 wherein the first means
further comprises means for sensing the live image and forming
a two-tone digital representation thereof; means for storing
the digital representation; and means for processing the
digital representation to generate a set of points
representative of characteristic features of the live image.

11. An apparatus according to claim 9, further comprising a
third means associated with the first means for receiving an
indication of the purported identity of the live image; and

storage means for storing a plurality of sets of points
representative of characteristic features of a plurality of
corresponding master images;

wherein the computing means is adapted to receive from the
third means a signal identifying the purported identity of
the live image and obtain from the storage means a set of
points for the master image associated with the purported
identity.
ABSTRACT



A method is described for identifying pages that are near 
duplicates in a linked database. In the linked database, pages 
can have incoming links and outgoing links. Two pages are 
selected, a first page and a second page. For each selected 
page, the number of outgoing links is determined. The two 
pages are marked as near duplicates based on the number of 
common outgoing links for the two pages. 

4 Claims, 2 Drawing Sheets 
METHOD FOR IDENTIFYING NEAR DUPLICATE PAGES IN A HYPERLINKED
DATABASE

FIELD OF THE INVENTION

This invention relates generally to computerized information
retrieval, and more particularly to identifying near duplicate
pages in a hyperlinked database environment such as the World
Wide Web.

BACKGROUND OF THE INVENTION

It has become common for users of host computers connected to
the World Wide Web (the "Web") to employ Web browsers and
search engines to locate Web pages having specific content of
interest to users. A search engine, such as Digital Equipment
Corporation's AltaVista search engine, indexes hundreds of
millions of Web pages maintained by computers all over the
world. The users of the hosts compose queries, and the search
engine identifies pages that match the queries, e.g., pages
that include key words of the queries. These pages are known
as a result set.

In many cases, particularly when a query is short or not well
defined, the result set can be quite large, for example,
thousands of pages. The pages in the result set may or may not
satisfy the user's actual information needs. Therefore,
techniques have been developed to identify a smaller set of
related pages.

In one prior art technique used by the Excite search engine,
please see "http://www.excite.com," users first form a query
that attempts to specify a topic of interest. After the result
set has been returned, the user can use a "Find Similar"
option to locate related pages. However, there the finding of
the related pages is not fully automatic because the user
first is required to form a query, before related pages can be
identified. In addition, this technique only works on the
Excite search engine and for the specific subset of Web pages
that are indexed by the Excite search engine.

In another prior art technique, an algorithm for connectivity
analysis of a neighborhood graph (n-graph) is described by
Kleinberg in "Authoritative Sources in a Hyperlinked
Environment," Proc. 9th ACM-SIAM Symposium on Discrete
Algorithms, 1998, and also in IBM Research Report RJ 10076,
May 1997, see,
http://www.cs.cornell.edu/Info/People/kleinber/auth.ps. The
algorithm analyzes the link structure, or connectivity, of Web
pages "in the vicinity" of the result set to suggest useful
pages in the context of the search that was performed.

The vicinity of a Web page is defined by the hyperlinks that
connect the page to others. A Web page can point to other
pages, and the page can be pointed to by other pages. Close
pages are directly linked, farther pages are indirectly
linked. This connectivity can be expressed as a graph where
nodes represent the pages, and the directed edges represent
the links. The vicinity of all the pages in the result set is
called the neighborhood graph.

Specifically, the Kleinberg algorithm attempts to identify
"hub" and "authority" pages in the neighborhood graph for a
user query. Hubs and authorities exhibit a mutually
reinforcing relationship.

In U.S. patent application Ser. No. 09/007,635 "Method for
Ranking Pages Using Connectivity and Content Analysis" filed
by Bharat et al. on Jan. 15, 1998, a method is described that
examines both the connectivity and the content of pages to
identify useful pages. However, the method is relatively slow
because all pages in the neighborhood graph are fetched in
order to determine their relevance to the query topic. This is
necessary to reduce the effect of non-relevant pages in the
subsequent connectivity analysis phase.

In U.S. patent application Ser. No. 09/058,577 "Method for
Ranking Documents in a Hyperlinked Environment using
Connectivity and Selective Content Analysis" filed by
[illegible] on [illegible], a method is described which
performs content analysis on only a small subset of the pages
in the neighborhood graph to determine relevance weights, and
pages with low relevance weights are pruned from the graph.
Then, the pruned graph is ranked according to a connectivity
analysis. This method still requires the result set of a query
to form a query topic.

In [illegible] above cases, it would be advantageous if
[illegible] essentially represent the same content. It
[illegible] duplicates could be identified without having to
analyze the detailed content of the pages.

SUMMARY OF THE INVENTION

Provided is a method for identifying near duplicate pages
[illegible] World Wide Web. A first and second page are
selected for a duplicate determination. For each page the
number of outgoing links is counted. Pages are marked as near
duplicates based on the number of common outgoing links
between the two pages.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a hyperlinked environment that
uses the invention;

FIG. 2 is a flow diagram of a method according to the
invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

System Overview

FIG. 1 shows a hyperlinked environment 100 where the invention
can be used. The hyperlinked environment is an arrangement of
client computers 110 and server computers 120 [illegible]
connected to each other by a network 130, for example, the
Internet. The network 130 includes an application level
interface called the World Wide Web (the "Web") 131.

[illegible] the clients 110 [illegible] documents, for
example, multi-media Web pages 121 maintained by the servers
120. Typically, this is done with a Web browser (B) 114
executing in the client 110. The location of each page 121 is
indicated by an associated Universal Resource Locator (URL)
122. Many of the pages include "hyperlinks" 123 to other
pages. The hyperlinks are also in the form of URLs.

Although the invention is described with respect to documents
that are Web pages, it should be understood that our invention
can also be worked with any linked data objects of a database
whose content and connectivity can be characterized.

In order to help users locate Web pages of interest, a search
engine 140 can maintain an index 141 of Web pages in a memory,
for example, disk storage. In response to a query 111 composed
by a user using the Web browser (B) 114, the search engine 140
returns a result set 112 which satisfies the terms (key words)
of the query 111. Because the search engine 140 stores many
millions of pages, the result set 112, particularly when the
query 111 is loosely specified, can include a large number of
qualifying pages.
These pages may or may not be related to the user's actual
information need. Therefore, the order in which the result
set 112 is presented to the client 110 is indicative of the
usefulness of the search engine 140. A good ranking process
will return "useful" pages before pages that are less so.

We provide an improved ranking method 200 that can be 
implemented as part of a search engine 140. Alternatively, 
our method 200 can be implemented by one of the clients 
110 as part of the Web browser 114. Our method uses 
content analysis, as well as connectivity analysis, to improve 
the ranking of pages in the result set 112 so that just pages 
related to a particular topic are identified. 
Introduction 

Our invention is a method that takes an initial single
selected Web page 201 as input, and produces a subset of 
related Web pages 113 as output. Our method works by 
examining the "neighborhood" surrounding the initial 
selected page 201 in a Web neighborhood graph and exam- 
ining the content of the initial selected page and other pages 
in the neighborhood graph. 

Our method relies on the assumption that related pages 
will tend to be "near" the selected page in the Web neigh- 
borhood graph, or that the same keywords will appear as part 
of the content of related pages. The nearness of a page can 
be expressed as the number of links (K) that need to be 
traversed to reach a related page. 

FIG. 2 shows the steps of a method according to our
invention. As stated above, the method can be implemented
as a software program in either a client or server computer.
In either case, the computers 110, 120, and 140 include
conventional components such as a processor, memory, and
I/O devices that can be used to implement our method.
Building the Neighborhood Graph 

We start with an initial single selected page 201, i.e., the
page 201 includes a topic which is of interest to a user. The 
user can select the page 201 by, for example, giving the URL 
or "clicking" on the page. It should be noted that the initial 
selected page can be any type of linked data object, text, 
video, audio, or just binary data as stated above. 

We use the initial page 201 to construct 210 a neighbor- 
hood graph (ngraph) 211 in a memory. Nodes 212 in the 
graph represent the initial selected page 201 as well as other 
closely linked pages, as described below. The edges 213 
denote the hyperlinks between pages. The "size" of the 
graph is determined by K which can be preset or adjusted 
dynamically as the graph is constructed. The idea is that
the graph needs to represent a meaningful number of pages.

During the construction of the neighborhood graph 211, 
the direction of links is considered as a way of pruning the 
graph. In the preferred implementation, with K=2, our
method only includes nodes within distance 2 that are
reachable by going one link backwards ("B"), pages reachable
by going one link forwards ("F"), pages reachable by going one
link backwards followed by one link forward ("BF") and
those reachable by going one link forwards and one link
backwards ("FB"). This eliminates nodes that are reachable
only by going forward two links ("FF") or backwards two
links ("BB").
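The direction-based selection for K=2 can be sketched as below. This is an illustrative sketch only: the dictionary representation of the link structure, the `invert` helper, and the sample page names are assumptions, and backward links are obtained here by inverting an in-memory map of outgoing links.

```python
from collections import defaultdict


def invert(out_links):
    """Derive incoming links from a map of page -> outgoing links."""
    in_links = defaultdict(set)
    for src, dsts in out_links.items():
        for dst in dsts:
            in_links[dst].add(src)
    return in_links


def build_neighborhood(start, out_links):
    """Collect pages reachable from start by B, F, BF or FB paths."""
    in_links = invert(out_links)
    back = set(in_links.get(start, ()))                           # "B"
    fwd = set(out_links.get(start, ()))                           # "F"
    back_fwd = {q for p in back for q in out_links.get(p, ())}    # "BF"
    fwd_back = {q for p in fwd for q in in_links.get(p, ())}      # "FB"
    return {start} | back | fwd | back_fwd | fwd_back


# Hypothetical link structure: z -> a -> s -> b -> c, a -> d, e -> b.
web = {"z": ["a"], "a": ["s", "d"], "s": ["b"], "e": ["b"], "b": ["c"]}
nodes = build_neighborhood("s", web)
```

In this example page "c" is reachable only by "FF" and page "z" only by "BB", so neither enters the neighborhood, while the siblings "d" (via "BF") and "e" (via "FB") do.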

To eliminate some unrelated nodes from the neighbor- 
hood graph 211, our method relies on a list 299 of "stop" 
URLs, which are URLs that are so popular that they are 
highly referenced from many, many pages, such as popular 
search engines. An example is "www.altavista.digital.com." 
These "stop" nodes are very general purpose and so are 
generally not related to the specific topic of the selected page 
201, and so serve no purpose in the neighborhood graph. Our 
method checks each URL against the stop list 299 during the 
neighborhood graph construction, and eliminates the node
and all incoming and outgoing edges if a URL is found on
the stop list 299.
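A stop-list check of this kind might look like the following sketch. Matching on the host name of each URL, the edge-list representation, and the example URLs are assumptions for illustration; the text does not say how URLs are compared against the list.

```python
from urllib.parse import urlsplit


def remove_stop_nodes(edges, stop_hosts):
    """Drop every edge that starts or ends at a page on the stop list."""
    def is_stop(url):
        # Assumption: a URL is a "stop" URL when its host is on the list.
        return urlsplit(url).netloc.lower() in stop_hosts

    return [(src, dst) for src, dst in edges
            if not is_stop(src) and not is_stop(dst)]


stop_list = {"www.altavista.digital.com"}
edges = [
    ("http://a.example/x", "http://www.altavista.digital.com/"),
    ("http://a.example/x", "http://b.example/y"),
]
filtered = remove_stop_nodes(edges, stop_list)
```

Removing the edges is enough to remove the node in an edge-list representation, because a node with no remaining edges no longer appears in the graph.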
In some cases, the neighborhood graph becomes too large.
For example, highly popular pages are often pointed to by
many thousands of pages and including all such pages in the
neighborhood graph is impractical. Similarly, some pages
contain thousands of outgoing links, which also cause the
graph to become too large. Our method filters the incoming
or outgoing edges by choosing only a fixed number M of
them. In our preferred implementation, M is 50. In the case
that the page was reached by a backwards link L, and the
page has more than M outgoing links, our method chooses
the M links that surround the link L on the page.

In the case of a page P that has more than M pages
pointing to page P, our method will choose only M of the
pages for inclusion in the neighborhood graph. Our method
chooses M pages from a larger set of N pages pointing to
page P by selecting the M pages with highest in-degree in the
graph. The idea is that pages with high in-degree are
likely to be of higher quality than those with low in-degree.
In some cases, two pages will have identical contents, or
nearly identical contents. This can happen when the page
was copied, for example. In such cases, we want to include
only one such page in our neighborhood graph, since the
presence of multiple copies of a page will tend to artificially
increase the importance of any pages that they point to. We
collapse duplicate pages to a single node in the neighborhood
graph. There are several ways that one could identify
duplicate pages.

One way would be to examine the textual content of the
pages to see if they are duplicates or near-duplicates, as
described by Broder et al. in "Method for clustering closely
resembling data objects," filed Mar. 26, 1998. Another way,
that is less computationally expensive and which does not
require the content of the page, is to examine the outgoing
links of two pages. If there are a significant number of
outgoing links and they are mostly identical, these pages are
likely to be duplicates. We identify this case by choosing a
threshold number of links Q. Pages P1 and P2 are considered
near duplicates if both P1 and P2 have more than Q links,
and a large fraction of their links are present in both P1 and
P2.
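The link-based test can be sketched as follows. The text gives neither a value for Q nor a definition of "a large fraction", so the `min_fraction` parameter, its default, and the choice to measure the shared links against the larger of the two link sets are all assumptions.

```python
def near_duplicates(links1, links2, q, min_fraction=0.9):
    """Decide whether two pages look like near duplicates from links alone."""
    set1, set2 = set(links1), set(links2)
    if len(set1) <= q or len(set2) <= q:
        return False                      # too few links to judge reliably
    common = len(set1 & set2)
    # Assumption: "a large fraction" is measured against the larger set.
    return common >= min_fraction * max(len(set1), len(set2))


page1 = [f"http://example.org/{i}" for i in range(10)]
page2 = page1[:9] + ["http://example.org/other"]
```

Here `page1` and `page2` share nine of their ten outgoing links, so `near_duplicates(page1, page2, q=8, min_fraction=0.8)` treats them as near duplicates, while raising Q above the number of links on either page rules the comparison out.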

Relevancy Scoring of Nodes in the Neighborhood Graph 

We next score 220 the content of the pages represented by
the graph 211 with respect to a topic 202. We extract the
topic 202 from the initial page 201.

Scoring can be done using well known retrieval techniques.
For example, in the Salton & Buckley model, the
content of the represented pages 211 and the topic 202 can
be regarded as vectors in an n-dimensional vector space,
where n corresponds to the number of unique terms in the
data set. A vector matching operation based on the cosine of
the angle between vectors is used to produce scores 203 that
measure similarity. Please see Salton et al.,
"Term-Weighting Approaches in Automatic Text Retrieval,"
Information Processing and Management, 24(5), 513-23, 1988. A
probabilistic model is described by Croft et al. in "Using
Probabilistic Models of Document Retrieval without
Relevance Feedback," Documentation, 35(4), 285-94, 1979.
For a survey of ranking techniques in Information Retrieval
see Frakes et al., "Information Retrieval: Data Structures &
Algorithms," Chapter 14, 'Ranking Algorithms,'
Prentice-Hall, N.J., 1992.
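The cosine-based matching operation can be illustrated with a short sketch. Using raw term counts from whitespace-split text is an assumption made to keep the example small; the cited work discusses several term-weighting schemes that could be substituted.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a sparse term-frequency vector from whitespace-separated text."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine of the angle between two sparse term vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0                      # an empty vector matches nothing
    return dot / (norm_u * norm_v)


topic = term_vector("fingerprint matching minutiae")
page = term_vector("matching fingerprint minutiae by distance")
score = cosine(topic, page)
```

A score near 1 indicates a page whose terms closely follow the topic, and a score of 0 indicates no shared terms.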

Our topic vector can be determined as the term vector of
the initial page 201, or as a vector sum of the term vector of
the initial selected page and some function of the term
vectors of all the pages represented in the neighborhood graph
211. One such function could simply weight the term vectors
of each of the pages equally, while another more complex
function would give more weight to the term vectors of
pages that are at a smaller distance K from the selected page
201. Scoring 220 results in a scored graph 215.
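A distance-weighted topic vector could be formed as in the sketch below. The 1/distance weighting is only one possible choice of the "more complex function" and is an assumption, as are the function name and the dictionary representation of term vectors.

```python
def topic_vector(initial, neighbors, weight=lambda distance: 1.0 / distance):
    """Sum the initial page's term vector with weighted neighbor vectors.

    `neighbors` is a list of (term_vector, distance) pairs, where distance
    is the number of links between the neighbor and the selected page.
    """
    topic = {term: float(count) for term, count in initial.items()}
    for vector, distance in neighbors:
        w = weight(distance)
        for term, count in vector.items():
            topic[term] = topic.get(term, 0.0) + w * count
    return topic


initial = {"web": 1}
neighbors = [({"web": 1, "graph": 2}, 1), ({"graph": 2}, 2)]
weighted = topic_vector(initial, neighbors)
equal = topic_vector(initial, neighbors, weight=lambda distance: 1.0)
```

Passing a constant weight function reproduces the simpler variant in which every page's term vector counts equally.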
Pruning Nodes in the Scored Neighborhood Graph 

After the graph has been scored, the scored graph 215 is 
"pruned" 230 to produce a pruned graph 216. Here, pruning 
means removing those nodes and links from the graph that 
are not "similar." There are a variety of approaches which 
can be used as the threshold for pruning, including median 
score, absolute threshold, or a slope-based approach. 
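Two of the named thresholds, median score and absolute threshold, can be sketched as follows. Keeping nodes whose score equals the cutoff is an assumption, and the slope-based approach is omitted because the text does not describe it.

```python
from statistics import median


def prune(scores, edges, threshold=None):
    """Remove nodes scoring below a cutoff, together with their links.

    With threshold=None the median score is used as the cutoff;
    otherwise the given absolute threshold applies.
    """
    cutoff = median(scores.values()) if threshold is None else threshold
    kept = {node for node, score in scores.items() if score >= cutoff}
    kept_edges = [(a, b) for a, b in edges if a in kept and b in kept]
    return kept, kept_edges


scores = {"a": 0.9, "b": 0.5, "c": 0.1}
edges = [("a", "b"), ("b", "c")]
kept_nodes, kept_edges = prune(scores, edges)
```

Calling `prune(scores, edges, threshold=0.8)` instead applies the absolute threshold and leaves only node "a".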
Connectivity Scoring the Pruned Graph

In step 240, the pruned graph is scored again, this time 
based on connectivity. This scoring effectively ranks the 
pages, and pages above a predetermined rank can be pre- 
sented to the user as the related pages 113. 

One algorithm which performs this scoring is the Klein- 
berg algorithm mentioned previously. This algorithm works 
by iteratively computing two scores for each node in the
graph: a hub score (HS) 241 and an authority score 242. The 
hub score 241 estimates good hub pages, for example, a page 
such as a directory that points to many other relevant pages. 
The authority score 242 estimates good authority pages, for 
example, a page that has relevant information. 

The intuition behind Kleinberg's algorithm is that a good 
hub is one that points to many documents and a good 
authority is one that is pointed to by many documents. 
Transitively, an even better hub is one that points to many 
good authorities, and an even better authority is one that is 
pointed to by many good hubs. 
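
The mutual reinforcement described above can be sketched as a simple iteration. This is a minimal illustration of a hub and authority computation, not the exact procedure of step 240; the function name hits, the fixed iteration count, and the Euclidean normalization are assumptions.

```python
import math


def hits(links, iterations=20):
    """Iteratively compute hub and authority scores for a directed graph.

    links: list of (source, target) pairs, meaning source points to target
    """
    nodes = {node for edge in links for node in edge}
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority: sum of the hub scores of the pages pointing to it.
        auth = {node: 0.0 for node in nodes}
        for source, target in links:
            auth[target] += hub[source]
        # Hub: sum of the authority scores of the pages it points to.
        new_hub = {node: 0.0 for node in nodes}
        for source, target in links:
            new_hub[source] += auth[target]
        hub = new_hub
        # Normalize both score sets so the values stay bounded.
        for scores in (hub, auth):
            norm = math.sqrt(sum(s * s for s in scores.values())) or 1.0
            for node in scores:
                scores[node] /= norm
    return hub, auth
```

Pages can then be ranked by authority score, by hub score, or by a combination of the two.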

Bharat et al. have come up with several improved algorithms
that provide more accurate results than Kleinberg's
algorithm, and any of these could be used in step 240.
Differences with the Prior Art

Our method differs from prior art in the graph building 
and pruning steps. 

A simple prior art building method treated the n-graph as 
an undirected graph and used any page within a distance K 
to construct the graph. 

Refinements to this method considered the graph as
directed and allowed a certain number of backward hyperlink
traversals as part of the neighborhood graph construction.
Note that this refinement required backward connectivity
information that is not directly present in the Web pages
themselves.

This information can be provided by a server 150, such as
a connectivity server or a search engine database; see U.S.
patent application Ser. No. 09/037,350, "Connectivity
Server," filed by Broder et al. on Mar. 10, 1998. Typical
values of K can be 2 or 3. Alternatively, K can be determined
dynamically, depending on the size of the neighborhood
graph: for example, first try to build a graph for K=2, and if
this graph is not considered large enough, use a larger value
for K.

There are two differences in our method. First, we start
with a single Web page as input, rather than the result set
produced by a search engine query. The second difference
deals with how the initial neighborhood graph 211 is
constructed. Kleinberg includes all pages that have a directed
path of length K from or to the initial set.

In contrast, we look at the Web graph as an undirected
graph and include all pages that are K undirected links away
from our initial selected page. This has the benefit of
including pages that can be reached by "up-down" path
traversals of the graph, such as pages that are both indexed by
the same directory page, but which are not reachable from
each other using just a directed path.

In the presence of useful hub pages, that is, pages that point
to many related pages, our approach will include in our
neighborhood graph all of the related pages referenced by the
hub which might be similar to the selected page 201.
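
The undirected construction can be illustrated with a breadth-first traversal that ignores link direction. This is a sketch only: the function name neighborhood and the interpretation that pages within K undirected links are included are assumptions, and in practice the link data would come from a source such as the connectivity server 150.

```python
from collections import deque


def neighborhood(links, start, k):
    """Return {page: undirected link distance} for pages within k links.

    links: list of (source, target) hyperlink pairs
    start: the selected page
    k:     maximum number of undirected links to traverse
    """
    # Treat every hyperlink as traversable in both directions.
    adjacent = {}
    for source, target in links:
        adjacent.setdefault(source, set()).add(target)
        adjacent.setdefault(target, set()).add(source)
    distance = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if distance[page] == k:
            continue  # do not expand beyond k links
        for neighbor in adjacent.get(page, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[page] + 1
                queue.append(neighbor)
    return distance
```

With a directory page "dir" linking to both "x" and "y", a call with start "x" and k of 2 reaches "y" by going up to "dir" and back down, even though no directed path joins "x" and "y".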
Pruning 

Our method differs from the Kleinberg method, in which
no pruning of the neighborhood graph was performed.

Bharat et al. improved the Kleinberg method by pruning the
graph to leave a subset of pages which are fed to the ranking
step to yield more accurate results.

However, because we start with a single Web page, rather
than with results from a query, we do not have an initial
query against which to measure the relevance of the related
pages. Instead, we use the content of the initial page, and
optionally the content of other pages in the neighborhood
graph, to arrive at a topic vector.
Advantages and Applications 

Our invention enables automatic identification of Web
pages related to a single Web page. Thus, if a user locates
just one page including an interesting topic, then other pages
related to the topic are easily located. According to the
invention, the relationship is established through the use of
connectivity and content analysis of the page and nearby
pages in the Web neighborhood.

By omitting the content analysis steps of our method, the
method is able to identify related URLs for the selected page
201 solely through connectivity information. Since this
information can be quickly provided by means of a connectivity
server 150, the set of related pages can be identified
without fetching any pages or examining the contents of any
pages.

One application of this invention allows a Web browser
in a client computer to provide a "Related Pages" option,
whereby users can quickly be taken to any of the related
pages. Another application is in a server computer that
implements a Web search engine. There, a similar option
allows a user to list just related pages, instead of the entire
result set of a search.
We claim: 

1. A method for identifying pages that are near duplicates 
in a linked database, the pages in the database having 
incoming links and outgoing links, comprising the steps of: 

selecting a first page and a second page; 
determining the outgoing links for the first page and the 
second page; 

determining the number of outgoing links that are common
for the first page and the second page;

marking the first page and the second page as near
duplicate pages based on the number of common
outgoing links.

2. The method of claim 1 wherein the number of common
outgoing links is the intersection of the outgoing links of the
first and second pages.

3. The method of claim 1 wherein the first and second
pages are near duplicate pages when the ratio of the number
of common outgoing links divided by the union of the
outgoing links of the first and second pages is larger than a
predetermined threshold.

4. The method of claim 1 wherein the first and second
pages are near duplicate pages when the ratio of the number
of common outgoing links divided by the total number of
outgoing links is larger than a predetermined threshold.
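
For illustration only, and without limiting the claims, the ratio test of claim 3 might be sketched as follows. The function name is_near_duplicate and the default threshold of 0.8 are hypothetical; the claims specify only that the threshold is predetermined.

```python
def is_near_duplicate(outgoing_a, outgoing_b, threshold=0.8):
    """Compare two pages by the overlap of their outgoing links.

    outgoing_a, outgoing_b: iterables of link targets (e.g. URLs)
    threshold:              hypothetical predetermined threshold
    """
    a, b = set(outgoing_a), set(outgoing_b)
    union = a | b
    if not union:
        return False  # no outgoing links, so nothing to compare
    common = a & b  # links common to both pages (the intersection)
    return len(common) / len(union) > threshold
```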

***** 

Abstract (Basic): JP 2000275338 A

NOVELTY - Amount calculation unit (8) calculates target direction
and target amount of characteristics based on produced target
image. Amount calculation unit (10) calculates target candidates
amount of characteristics. Amount comparator (11) compares the
calculated results and outputs an identification result.

DETAILED DESCRIPTION - Target image production unit (7) processes
the image to radar image based on distance resolution, to produce
target image. Reference image production unit (9) produces
reference image from preset target candidate 3D data based on
projection angle of signal and distance resolution of radar receiving
unit (1). An INDEPENDENT CLAIM is also included for target
identification procedure.

USE - Waterborne target identification device for identifying
velocity, position, flight path, etc.

ADVANTAGE - Evaluates fixed quantity target and identification
processing time of target is carried out in short time.

DESCRIPTION OF DRAWING(S) - The figure (containing non-English
text) shows the block diagram of waterborne target identification
device.

Abstract (Basic): JP 10247246 A

The method involves producing an objective reference image in a
learning unit (10). The amount of colour characteristics of the
reference image is calculated in a calculation unit (40). The amount
of colour characteristics for each partial area of a strange image
is also calculated in the calculation unit.

A detector unit (20) compares the colour characteristics of the
strange image with that of the reference image for detecting the
existence of search object in the strange image along with its
position. The detected result is output to an output unit. The
calculation unit uses a histogram obtained from the ratio of a pixel to
its colour value for calculating the colour characteristics.

ADVANTAGE - Is utilised under different illumination environments.

Abstract (Basic): JP 11337312 A

NOVELTY - A discrimination unit (8) distinguishes whether the
correspondence relationship between attention pixel of a standard image
and corresponding pixel of another image is correct. A compensation
unit (9) adjusts the parallax data which is required by congruent point
search unit (6), when discrimination result is incorrect.

DETAILED DESCRIPTION - Amount extraction units (5L,5R) extract the
amount of characteristics in images obtained from input units (4L,4R),
one of which is a standard image. A pixel of specific image
corresponding to attention pixel of standard image is searched by
congruent point search unit (6). A calculation unit (7) calculates the
distance data of target object based on parallax data between
both pixels.

USE - For target object distance data measurement in image
processor.

ADVANTAGE - Improves accuracy of congruent point search, hence
obtains accurate distance data.

DESCRIPTION OF DRAWING(S) - The figure shows the block diagram of
component of principal part of target object distance data
measurement unit. (4L,4R) Image input units;
(5L,5R) Amount extraction units; (6) Congruent point search unit; (7)
Calculation unit; (8) Discrimination unit; (9) Compensation unit.

Abstract (Basic): JP 2000010989 A

NOVELTY - A search unit (15) uses index to search object opposing
reference object, based on number determined by weighting coefficient
difference between amount variety of characteristics. From vicinity
objects, candidate objects are collected. Similarity is judged,
based on distance between reference object and weighting
coefficient of candidate. The candidate objects are set in order.

DETAILED DESCRIPTION - A storage unit (10) stores the amount
variety of the characteristics of object as a point of
multidimensional vector space. An index storing unit (12) stores
index for data search. A reference object input unit (131) is used to
input the reference object. In the designation unit (133), the user
designates the number of similar objects. Weight designation unit (132)
finds weight between the amount variety of the characteristic of
objects. The amount calculation unit (11) compares amount of the
characteristics of the reference object, based on the distance
between the reference object and similar object, stored as a
multidimensional vector space in storage unit. INDEPENDENT CLAIMS are
also included for the following:

(a) similar object search apparatus;

(b) similar object search program stored in recording medium.

USE - In electron museum, electron catalog to search object
similar to reference objects such as image, audio, music, text
etc.

ADVANTAGE - As the amount calculation unit calculates amount
variety of characteristics of object based on the distance
between reference object and similar object, labor for distance
calculation is saved. Searches similar object with arbitrary weight
with reduced time.

DESCRIPTION OF DRAWING(S) - The figure shows the block diagram of 
the similar object search apparatus. 

Storage unit (10)

Amount calculation unit (11) 

Index storing unit (12) 

Vicinity object search unit (15) 

Reference object input unit (131)

Weight designation unit (132) 

Number designation unit (133) 

Abstract (Basic): JP 11053386 A

NOVELTY - A characteristic amount memory stores the amount of
characteristics of the image of the target objects of similar
shape. A characteristic amount search unit searches the image for
search, based on the characteristic amount of the image for search.

DETAILED DESCRIPTION - INDEPENDENT CLAIMS are also included for the
following: an image search method; and an image search program recording
medium.

USE - For searching e.g. face image read from photograph when
inserting face image to e.g. document. For e.g. portrait production
apparatus.

ADVANTAGE - Similarity of images can be judged objectively and
quantitatively on the basis of the amount of characteristics e.g.
size and angle of the image of the target object. Improves
hitting ratio in searching known image, standard image or similar
image.

DESCRIPTION OF DRAWING(S) - The figure shows the block diagram
of a portrait production apparatus.

Abstract (Basic): EP 519737 A

The method involves creating an image of the character and reducing
the image of the character to a skeleton image. The skeleton image of
the character is represented on the basis of internal structure
corresponding to a number of nodes, and connections between the number
of nodes. The representation of the skeleton image of the character is
stored as the representation of the internal structure of the
character.

The internal structure is represented as a linked list with each
of the number of nodes corresponding to an entry in the list, and each
of the connections between them to a pointer to another entry in the
list.

USE/ADVANTAGE - For recognition of handwritten characters. Highly
efficient.

ABSTRACT

PROBLEM TO BE SOLVED: To provide an image similarity discriminating device
for specifying a common area between an original author image and a
subject image which supposedly approximates the original author image.

SOLUTION: This image similarity discriminating device 1 is constituted by
providing a data input part 11, a statistical amount calculating part 12, a
normalization processing part 13, a deviation value processing part 14, a
common area discriminating part 15 and a result output part 16. After the
original author image and the subject image are normalized by statistical
amount of each image characteristic at the normalization processing
part 13, the area with a large difference amount in characteristics
between both images is specified, and the specified area is removed at
the deviation value processing part 14. Then a distance value between both
the images is calculated by calculating the statistical amount and
performing normalization processing again. The number of common areas is
discriminated by comparing the distance value and a threshold value by the
common area discriminating part 15. When many common areas exist,
information regarding the subject images is outputted to the result output
part 16 and then visualized.

COPYRIGHT: (C)1999,JPO



13/5/40 (Item 40 from file: 347) 

DIALOG(R)File 347:japio 

(c) 2006 JPO & JAPIO. All rts. reserv. 

06097813 **Image available**

METHOD AND DEVICE FOR RETRIEVING IMAGE AND RETRIEVAL SERVICE UTILIZING IT

PUB. NO.: 11-039332 [JP 11039332 A]
PUBLISHED: February 12, 1999 (19990212)
INVENTOR(S): MUSHA YOSHINORI
HIROIKE ATSUSHI
MORI YASUHIDE
APPLICANT(S): HITACHI LTD
APPL. NO.: 09-196154 [JP 97196154]
FILED: July 22, 1997 (19970722) 

INTL CLASS: G06F-017/30; G06T-001/00; G06T-007/00 

ABSTRACT 

PROBLEM TO BE SOLVED: To efficiently retrieve a desired image from an
image database by calculating integrated similarity from a
characteristic amount that is extracted from a reference image and
each characteristic amount that is preliminarily assigned to a retrieved
image.

SOLUTION: A person who retrieves designates a specific area of a reference
image through a GUI of an input operation image display 209, also
designates its characteristic amount 201 and inputs its weight 204, etc.
Integrated similarity 203' is generated by matching the amount 201 of an
image to an image characteristic amount of an image database 205,
acquiring similarity 203 in each characteristic amount and weighting a
characteristic amount in each reference image. After that, a sort step
208 performs rearrangement in order of large integrated similarity, its
retrieval result data name is sent to the display 209 and an image layout
is generated. The data name is sent as a read request 212 for a retrieval
result image to an image database 205, and image data 210 is sent to the
display 209.

COPYRIGHT: (C)1999,JPO

ABSTRACT

PROBLEM TO BE SOLVED: To save labor for operating distance calculation
for all objects, and to retrieve a similar object by arbitrary weight.

SOLUTION: Coordinator part 16 designates characteristic amounts VRi of the
(i)th kind of a reference object and a number f(K) for designating the
number of neighborhood objects for each characteristic amount kind
(i), and requests neighborhood object retrieval to a neighborhood object
retrieving part 15. The neighborhood object retrieving part 15 searches the
f(K) pieces of neighborhood objects with a short spatial distance between
points corresponding to the reference objects in the multi-dimensional
vector space for each characteristic amount kind (i) by using an index,
and returns them to the coordinator part 16. The coordinator part 16
prepares a candidate object group by gathering the neighborhood objects
returned for all the characteristic amount kinds. Next, when a distance
(dki) between points VRi and Obki corresponding to the reference
objects for each characteristic amount kind is not searched for a
candidate object obk, the insufficient amounts are calculated.

ABSTRACT

PURPOSE: To retrieve similar picture independently of the
characteristic of a sampled reference picture and to improve the
retrieval accuracy by updating the reference picture every time when
the similar picture is detected in retrieving the similar picture.

CONSTITUTION: A reference picture setting part 1 sets a reference picture
rO from the picture inputted by a picture input part 0. A part 2 setting
and updating the feature amount sets the feature amount Temp (Rj) of
the reference picture. On the other hand, a part 4 selecting the picture
to be retrieved selects the picture to be retrieved Si (i=1...n) from the
inputted picture, and a part 5 setting the feature amount sets the
feature amount Temp (Si). The two feature amounts Temp (Rj) and
Temp (Si) are inputted to a similarity arithmetic part 6 and the degree of
similarity is measured. The degree of similarity is estimated by a part 7
judging the similarity. When the degree of similarity is high, the
retrieval picture is judged to belong to the same cluster with the
reference picture, then the processing of a cluster picture recording
part 8 is performed. The feature amount Temp (Rj) of the reference
picture is updated every time when the similar picture is detected.

ABSTRACT

PURPOSE: To provide the method and the device which are capable of
high-precision picture retrieval matched to user's intention.

CONSTITUTION: Appendant information related to a picture to be retrieved is
inputted from a retrieval appendant information input part 43 and is
compared with appendant information in the appendant information storage
part of a data base to select a matched candidate of picture data. Plural
example pictures are inputted from an example picture input part 41,
and plural feature quantity data are extracted by a feature quantity
calculating part 2. Distances between these feature quantity data and the
feature quantity of the selected picture stored in a feature quantity
storage part 33 are calculated. These distances are sorted in the
descending order by a candidate order determining part, and corresponding
picture data is displayed on a display part 7.

ABSTRACT

PURPOSE: To enable the high-speed processing of image pattern recognition
by shortening the time required for recognition algorithm development and
further reducing redundant data in data used for discrimination.

CONSTITUTION: When a sufficient number of sample images for the
discrimination are supplied by discrimination classes, a ternary converting
device 1a converts each pixel brightness value into a ternary value and a
Hadamard's transforming device 2a performs Hadamard's transformation to
generate feature quantity vectors. A templet generating device 3
generates a templet from the feature quantity vectors and stores it in
a memory 4. Once an image to be discriminated is supplied, a ternary
converting device 1b converts each pixel brightness value into a ternary
value similarly to the sample image and a Hadamard's transforming
device 2b performs Hadamard's transformation. A matching extent calculating
device 5 calculates the extent of matching of the feature quantity
vectors of the image to be discriminated with the templet to decide a
class.

ABSTRACT

PURPOSE: To eliminate the need to extract a feature quantity and to
estimate conversion parameters in a short time by converting a reference
image by density gradation conversion and geometric conversion and then
estimating an image so that the reference image matches an
input image.

CONSTITUTION: The image conversion is represented with an unknown pattern
vector (x) consisting of conversion parameters x(sub 1)-x(sub 6). Then an
area wherein variation in the density of the estimated image exceeds a
specific value is selected. In the area, intermediate parameters by the
partial differentiation of the estimated image as to the respective
parameters are calculated and a normal equation is found from residual to
calculate the corrected vector .delta.x' of the unknown parameter vector
x(sup (k)). Then it is decided whether or not the image is converged by
using the degree of coincidence between the estimated image based upon
the unknown parameter vector and the input image. When not, similar
processing is repeated to approximate the reference image to the input
image.

...SPECIFICATION having a reliability equal to or higher than the
predetermined threshold rth is counted. The ratio m/n of the number m
of corresponding points equal to or higher than the threshold rth to
the total number n is compared...

...CLAIMS the image fixed-synthesis section perform image synthesis so that
the first and second synthesized images match with each other in
the position of the road surface.
5. The device of Claim...

...is provided with an obstacle sensor, and
the synthesis scheme selection section selects a synthesized image
by additional use of distance information indicating a distance
value from an obstacle obtained from the obstacle sensor.
8. The...

Reducing accumulated systematic errors in image correlation
displacement (Speckle) with sub-pixel resolution
Reduction des fautes systematiques accumulees par la correlation des
images deplacees (Speckle) avec une resolution sub-pixel

...ABSTRACT A1

A reference image updating method and apparatus used in an image - 
correlation system which updates a reference image when predetermined 
control parameters are met. An image corresponding to a displacement of a 
surface. . . 

...this manner, systematic errors are prevented from accumulating thereby 
significantly removing systematic errors in the image - correlation 
system. 

...SPECIFICATION trough, depending on how the pixel -by-pixel comparison is 
performed, in the plot of correlation function value points . The 
offset amount corresponding to the peak or trough represents the 
amount of displacement or deformation between the reference... 

...systematic displacement estimation errors present when conventional 
sub-pixel estimation methods are applied to a number of correlation 
function value points , especially when the correlation function 
value points are arranged asymmetrically. However, the systems and 
methods disclosed in the 671 application fail to... 

...CLAIMS A1

1. A method for reducing accumulated systematic displacement errors in an
image - correlation -based displacement measuring system,
comprising:

determining at least one reference-class displacement between the two...

...one corresponding reference-class image pair based on a pre-determined
error characteristic of the image - correlation -based displacement
measuring system, b) acquiring the second image of the at least one
corresponding...

...to the prescribed displacement, based on the operating characteristics
and current operating state of the image - correlation -based
displacement measuring system; and

wherein, for at least two reference-class image pairs:

a...

...reference-class image pairs is determined partly based on a
predetermined error characteristic of the image - correlation -based
displacement measuring system and partly based on the difference
determined for the at least...

...to the prescribed displacement, based on the operating characteristics
and current operating state of the image - correlation -based
displacement measuring system.

10. The method of claim 9, wherein acquiring the second image... a 

corresponding movement of a surface which moves relative to a sensing 
device of the image - correlation -based displacement measuring 
system. 

15. The method of claim 1, wherein at least one reference-class
displacement is determined to a sub-pixel resolution during real-time 
operation of the image - correlation -based displacement measuring 
system. 

16. The method of claim 1, wherein at least one reference-class 
displacement is determined to a sub-pixel resolution during real-time 
operation of the image - correlation -based displacement measuring 
system, and that at least one reference-class displacement and the 
corresponding reference-class image pair is recorded in the image - 
correlation -based displacement measuring system, for use during 
subsequent real-time operation of the image - correlation -based 
displacement measuring system. 

17. The method of claim 1, wherein at least one reference...

...sub-pixel resolution by a prescribed procedure prior to subsequent
real-time operation of the image - correlation -based displacement
measuring system, and that at least one reference-class displacement 
and the corresponding reference-class image pair is recorded in the 
image - correlation -based displacement measuring system, for use 
during subsequent real-time operation of the image - correlation 
-based displacement measuring system. 

18. A method for reducing accumulated systematic displacement errors in 
an image - correlation -based displacement measuring system, 
comprising: 

determining at least one reference-class displacement between the two... 

...class image pair that corresponds to a fractional part of the
pixel -spacing of the image - correlation based displacement 
measuring system, the compensation based on a predetermined periodic 
error characteristic of the image - correlation -based displacement 
measuring system, b) acquiring the second image of the at least one 
corresponding... 

...ABSTRACT A2

An image search system for determining a similarity of an image
whose feature are represented by either one of image features amounts, a
color distribution features or a frequency distribution features, to
search for a similar image, including a to-be-searched image features
storage unit (60) for referring to data of...

...and the image features amount of each image to be searched based on the
converted image features amount and determining a similarity of each
image to search for a similar image.

...SPECIFICATION be compared (searched) should be prepared for the images. 
In addition, even when an image features amount of a kind common 
to both the images is provided, a function of conducting comparison and 
search based on the image features should be further provided in... 

...CLAIMS A2 

1. An image search system for determining a similarity of an image 

whose feature are represented by either one of image features 
amounts, a color distribution features or a frequency distribution 
features, to search for a similar image , comprising: 
means (10) for converting, with respect to an image set to be a target 

...the image features amount of each said image to be searched based on 

said converted image features amount and determining a similarity 
of each image to search for a similar image . 

2. The image search system as set forth in claim 1, further comprising 

...SPECIFICATION of the data elements at a certain position in two
sequences or partial sequences. This function may especially depend on
the number of identical data elements succeeding one another in two
partial sequences in said data sets.

The invention also provides...written text.

In the context of the present invention, a preferred distance measure
is a function related to the number of common data elements. This
function is usually defined in such a manner that identical data sets
have a distance zero...

...CLAIMS 3, characterized by the step of controlling a display device on
the basis of said correlation data to create a graphic symbolic
display of clusters at one or more levels.
5. Method according to one of...

...16, characterized in that said data sets comprise genetic information
and said distance is a function of the number of identical data
elements succeeding one another in two partial sequences in said
data sets.
19. Method according to...

...SPECIFICATION of page accesses, with np)) rows (the total number of
pages) and nu)) columns (the number of users). Each column corresponds
to a vector generated by the function (phi)p)), the derivation of
which is described in detail above. For example, the fifth...

...CLAIMS associating the at least one vector with the object. 

5. A method for calculating the similarity between two objects in a 

collection of objects, wherein each object is associated with at 
least one multi . . . 

...a first object and a second vector corresponding to a first feature of a 
second object ; and 
computing a first distance metric between the first vector and the 
second vector. 

6. A method for calculating the similarity between two objects in a 

collection of objects, wherein each object is associated with a 
plurality of multi . . . 

...first vector corresponding to a first object and a second vector 
corresponding to a second object , 
for each feature, computing a distance metric between the first vector 
and the second vector; and 



summing the distance metrics for. 

...ABSTRACT A1

The present invention has an object to provide a matched filter 
with further reduced electric power. In a matched filter circuit 
according to the present... 

...SPECIFICATION the user. The number of codes included in each spreading 
code is defined as "spreading ratio " equal to a number of taps or a 
number of multiplication portions of the matched filter. 

On the mobile communication, multi-path signals may reach the receiver 
consisting of a. . . 


...CLAIMS corresponding points on the basis of block matching;

calculation means for calculating coordinates including a 
distance to an object to be sensed on the basis of the detected 
corresponding points; and 

image generating means... 

...by said first detection means at first and second times upon
determining that an evaluation function including a predetermined
feature amount of the corresponding points at the first and
second times and the movement information is smaller than a
predetermined...

...amount includes basic color components.

17. The apparatus according to claim 14, wherein the evaluation
function is a function of a distance between a feature amount
of corresponding points at the first time and a feature amount
of positions obtained by correcting positions of ...corresponding
point detection means comprises:

stereo image corresponding point detection means for
detecting, by block matching, corresponding points between stereo
images which are sensed from different viewpoints at the same time;

left image corresponding point detection...
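
Block matching between stereo images, as named in the claims above, can be sketched as follows. The excerpt gives no cost function or camera model, so the sum of absolute differences (SAD), the rectified-stereo assumption, the depth formula Z = f * B / d, and all function and parameter names are assumptions for illustration only.

```python
def block_cost(left, right, row, col_left, col_right, half):
    """Sum of absolute differences between two (2*half+1)-square blocks."""
    total = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            total += abs(left[row + dr][col_left + dc]
                         - right[row + dr][col_right + dc])
    return total


def match_block(left, right, row, col, half=1, max_disparity=4):
    """Disparity d >= 0 whose right-image block at col - d has the lowest SAD.

    Assumes rectified images, so the search stays on one row.
    """
    best_d, best_cost = 0, None
    for d in range(max_disparity + 1):
        if col - d - half < 0:  # block would leave the image
            break
        cost = block_cost(left, right, row, col, col - d, half)
        if best_cost is None or cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity, focal_px, baseline):
    """Distance to the object for a rectified pair: Z = f * B / d."""
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline / disparity
```

The caller is expected to keep `row` and `col` at least `half` pixels away from the image border; a production matcher would also handle ties and textureless regions.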
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...ABSTRACT of information defined in digital form comprises transforming 
the clear file, containing said information, to graphic equivalent 
form, transmitting and/or storing the same in such graphic-equivalent
form and bringing it back to digital form. An article of manufacture is
also provided which consists of the graphic-equivalent form of a
computer file defined on a backing, in a particular form of the... 

...SPECIFICATION conventionally accepted as representing file elements, 
when these latter appear, should be less than the number of possible 
arrays having the same number of component bits, and more 
preferably the ratio of the two numbers should be at least 64, 
preferably at least 128 and still ...conventionally accepted as
representing file units, when these latter appear, should be less than 
the number of possible arrays having the same number of component 
bits, and more preferably the ratio of the two numbers should be at 
least 64, preferably at least 128 and still... 
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...ABSTRACT 0", it is determined that the point pair combination associated 
with the neuron is not matched. (see image in original document)

...SPECIFICATION the characteristic points can be determined. It has been 
found from our experiments that the number of iterations necessary for 
matching between the characteristic points, i.e., the number of t
updating times is proportional to the number n of characteristic 
points. This is considered due to the effect of... 

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...SPECIFICATION of a given number of sample points of greatest magnitude 
in said low resolution correlation function,

iv) applying the locations found for said given number of sample
points to said higher resolution correlation function to identify
the corresponding sample points in said higher resolution correlation
function,

v) determining the positions of peaks associated with the said 
number of sample points in said higher resolution correlation 
function which positions are defined to sub-sample interval accuracy. 

Such a method has the advantage... 

...of a given number of sample points of greatest magnitude in said low
resolution correlation function, means for applying the locations found
for said given number of sample points to said higher resolution 
correlation function to identify the corresponding sample points in 
said higher resolution correlation function, and means for determining 
the positions of peaks associated with the said number of sample 
points in said higher resolution correlation function which 
positions are defined to sub-sample interval accuracy. 
The apparatus may be further characterised... 

...CLAIMS for the production of motion vectors, said method being 
characterised by the steps of:- 

i) correlating two pictures to determine low resolution 
correlation as a function of displacement thereby to determine 
sample correlation values to a low resolution, 

ii) correlating said two pictures to determine higher
resolution correlation as a function of displacement thereby to 
determine sample correlation values to a higher resolution... 

...of a given number of sample points of greatest magnitude in said low
resolution correlation function,

iv) applying the positions found for said given number of sample
points to said higher resolution correlation function to
identify the corresponding sample points in said higher resolution
correlation function,

v) determining the locations of peaks associated with the said 
number of sample points in said higher resolution correlation 
function which locations are defined to sub-sample interval 
accuracy. 
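
Steps iii) to v) of the claimed method can be illustrated in one dimension. The excerpt does not say how sub-sample accuracy is obtained, so the three-point parabolic interpolation, the fixed `stride` relating the two resolutions, and all names below are assumptions rather than the patented procedure.

```python
def top_samples(values, count):
    """Indices of the `count` largest samples (step iii)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return order[:count]


def refine_peak(values, index):
    """Sub-sample peak position from a parabola through three samples."""
    if index <= 0 or index >= len(values) - 1:
        return float(index)
    left, mid, right = values[index - 1], values[index], values[index + 1]
    denom = left - 2 * mid + right
    if denom == 0:
        return float(index)
    return index + 0.5 * (left - right) / denom


def coarse_to_fine_peaks(low, high, stride, count):
    """Map the strongest low-resolution samples onto the higher-resolution
    correlation function (step iv) and refine each peak (step v).

    `low[i]` is assumed to sit at position i * stride of `high`.
    """
    peaks = []
    for i in top_samples(low, count):
        centre = i * stride
        lo = max(0, centre - stride)
        hi = min(len(high) - 1, centre + stride)
        best = max(range(lo, hi + 1), key=lambda j: high[j])
        peaks.append(refine_peak(high, best))
    return peaks
```

Searching only around the strongest coarse samples avoids scanning the whole higher-resolution function, which appears to be the point of the two-resolution arrangement.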
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Publication Language: English 
Filing Language: English 
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Full text Availability: 
Detailed Description 
Claims 

Detailed Description 

... single color attributes. Hence, the length of a colored portion in a 
color scale is proportional to the number of graphic elements,
whose corresponding indexation data falls into a given range
represented by the colored portion. To do this...

Claim 

scale (31) are designed so as to obtain a substantially even density
of matching graphic elements for all positions of the marker (34)
along the composite color scale (31...
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05431812 E.I. No: EIP99124936743 
Title: Finding the collineation between two projective reconstructions 

Author: Csurka, Gabriella; Demirdjian, David; Horaud, Radu 
Corporate Source: GRAVIR-IMAG & INRIA Rhone-Alpes, Montbonnot Saint
Martin, Fr

Source: Computer Vision and Image Understanding v 75 n 3 1999. p 260-268

Publication Year: 1999 

CODEN: CVIUF4 ISSN: 1077-3142 

Language: English 

Document Type: JA; (Journal Article) Treatment: G; (General Review) 
Journal Announcement: 0001W4 

Abstract: The problem of finding the collineation between two 3D
projective reconstructions has been proved to be useful for a variety of
tasks such as calibration of a stereo rig and 3D affine and/or Euclidean
reconstruction. Moreover, such a collineation may well be viewed as a point
transfer method between two image pairs with applications to visually
guided robot control. Despite this potential, methods for properly
estimating such a projective transformation have received little attention
in the past. In this paper we describe linear, nonlinear, and robust
methods for estimating this transformation. We test the numerical stability
of these methods with respect to image noise, to the number of matched
points, and as a function of the number of outliers. Finally, we
devise a specialized technique for the case where 3D Euclidean coordinates
are provided for a number of control points. (Author abstract) 17 Refs.

Descriptors: *Image reconstruction; Three dimensional; Robustness
(control systems); Cameras; Computer vision; Two dimensional

Identifiers: Collineation; Projective reconstructions 

Classification Codes: 

741.1 (Light/Optics); 731.1 (Control Systems); 742.2 (Photographic 
Equipment) 

741 (Optics & Optical Devices); 731 (Automatic Control Principles); 742
(Cameras & Photography)

74 (OPTICAL TECHNOLOGY); 73 (CONTROL ENGINEERING)
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Title: Improved optimum family genetic algorithm and its application for
image matching
Author(s): Wang Sun'an; Li Jianhua; Yu Qing

Author Affiliation: Sch. of Mech. Eng., Xi'an Jiaotong Univ., China
Journal: Chinese Journal of Scientific Instrument vol.26, no.10 p.
1027-30

Publisher: China Instrum. Soc.

Publication Date: Oct. 2005 Country of Publication: China 

CODEN: YYXUDY ISSN: 0254-3087 

SICI: 0254-3087(200510)26:10L.1027:IOFG;1-Z

Material Identity Number: G383-2005-014 

Language: Chinese Document Type: Journal Paper (JP) 

Treatment: Practical (P); Experimental (X)

Abstract: Based on the analysis of the speed and stability of the genetic
algorithm applied to functions with multi-modality and
multi-deceptive-problem, the improvement on powerful genetic algorithm
(family genetic algorithm) is put forward that individual evolvement is
just based on not the whole population but the optimal family to avoid the
premature phenomenon. At the same time, the new algorithm is applied to
image matching to prove the improvement effective. In order to reduce
the calculation amount on non-optimum points the sequence similar
detection algorithm (SSDA) is introduced to be the fitness function. The
experimental results indicate that improved optimum family genetic
algorithm and SSDA can be benefited from each other. The whole algorithm is
great effective in improving the speed of image matching and its
performance is steady. It can conclude that the new algorithm is potential
in solving the similar problems. (9 Refs)
Subfile: B C 

Descriptors: genetic algorithms; image matching 

Identifiers: optimum family genetic algorithm; image matching;
sequence similar detection algorithm; fitness function

Class Codes: B6135 (Optical, image and video signal processing); B0260 
(Optimisation techniques); C5260B (Computer vision and image processing 
techniques); C1250M (Image recognition); C1180 (Optimisation techniques) 
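
The abstract above uses SSDA as a fitness function so that poor candidate positions are abandoned early. The record gives no formulas, so the following is only a sketch of the general SSDA idea: absolute differences are accumulated pixel by pixel and the scan stops once a threshold is exceeded, with the number of pixels reached serving as the score. The threshold value, the scoring convention, the function names, and the exhaustive search (standing in for the genetic algorithm, which is not reproduced here) are all assumptions.

```python
def ssda_score(image, template, top, left, threshold):
    """Number of template pixels compared before the accumulated
    absolute error exceeds `threshold`; higher means a better match."""
    error = 0
    count = 0
    for r, template_row in enumerate(template):
        for c, value in enumerate(template_row):
            error += abs(image[top + r][left + c] - value)
            if error > threshold:
                return count  # early exit on a non-optimum point
            count += 1
    return count


def best_position(image, template, threshold):
    """Exhaustive search over all placements, used here only as a
    stand-in for the genetic search described in the abstract."""
    rows = len(image) - len(template) + 1
    cols = len(image[0]) - len(template[0]) + 1
    candidates = [(top, left) for top in range(rows) for left in range(cols)]
    return max(candidates,
               key=lambda p: ssda_score(image, template, p[0], p[1], threshold))
```

In a genetic search, `ssda_score` would be evaluated only for the positions encoded by the current population rather than for every placement.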
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Title: Pointwise digital image correlation using genetic algorithms

Author(s): Jin, H.; Bruck, H.A. 

Author Affiliation: Dept. of Mech. Eng., Maryland Univ., Baltimore, MD, 
USA 

Journal: Experimental Techniques vol.29, no.1 p. 36-9
Publisher: Soc. Experimental Mech.

Publication Date: Jan.-Feb. 2005 Country of Publication: USA

CODEN: EXPTD2 ISSN: 0732-8818 

SICI: 0732-8818(200501/02)29:1L.36:PDIC;1-S

Material Identity Number: D751-2005-002 

Language: English Document Type: Journal Paper (JP) 

Treatment: Practical (P) ; Theoretical (T) 

Abstract: Digital image correlation (DIC) has become an accepted
method for measuring full-field surface displacement and displacement
gradients in solid mechanics. The principle of DIC is to mathematically
compare unique subsets of data from digital image in a reference
configuration to digital images in deformed configurations in order to
determine the deformation parameters that can be applied to the reference
subsets that provide the best match to the deformed image. The purpose
of the work presented in this paper is to remove the constraint of constant
displacements and displacement gradients within a subset, and permit the
displacement field to vary discontinuously, as might be expected when a
subset overlays an interface or crack. This will enable the technique of
DIC to achieve the spatial resolution of alternative full-field deformation
measurements techniques. Therefore, the kinematic description that is
employed involves assessing the displacement of each pixel independently
(i.e., pointwise) with subpixel accuracy. This results in a much larger
number of parameters to optimize in the associated correlation
function. Therefore, a genetic algorithm (GA) is employed in the
pointwise DIC technique to provide a simpler and faster optimization
approach than is achieved using conventional gradient-based or coarse-fine
search methods. (11 Refs)
Subfile: C E 

Descriptors: correlation methods; cracks; genetic algorithms; image 
resolution; mechanical engineering computing 

Identifiers: pointwise digital image correlation; genetic algorithms;
deformation measurements techniques; pixel; coarse-fine search method;
conventional gradient-based search method; full-field surface displacement;
solid mechanics

Class Codes: C7440 (Civil and mechanical engineering computing); C5260B
(Computer vision and image processing techniques); C1180 (Optimisation
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14440844 PASCAL No.: 00-0099161 

Target matching in synthetic aperture radar imagery using a non-linear
optimization technique

Algorithms for synthetic aperture radar imagery VI : Orlando FL, 5-9
April 1999

METH R; CHELLAPPA R 
ZELNIO Edmund G, ed 

Department of Electrical Engineering and Center for Automation Research,
University of Maryland, College Park, MD, United States

International Society for Optical Engineering, Bellingham WA, United
States.

Algorithms for synthetic aperture radar imagery- Conference, 6 (Orlando 
FL USA) 1999-04-05 

Journal: SPIE proceedings series, 1999, 3721 532-542 

ISBN: 0-8194-3195-8 ISSN: 1017-2653 Availability: INIST-21760;
354000080090040490

No. of Refs.: 25 ref.

Document Type: P (Serial); C (Conference Proceedings); A (Analytic)
Country of Publication: United States 
Language: English 

Recognition of targets in synthetic aperture radar (SAR) imagery is
approached from the viewpoint of an optimization problem. Features are
extracted from SAR target images and are treated as point sets. The
matching problem is formulated as a non-linear objective function to
maximize the number of matched features and minimize the distance
between features. The minimum of this function is found using a
deterministic annealing process. Registration is performed iteratively by
using an analytically computed minimum at each temperature of the
annealing. Thus, the images do not need to be initially registered as any
translational error between them is solved for as part of the optimization.
We have also extended the initial objective function to incorporate
multiple feature classes. This matching method is robust to spurious,
missing and migrating features. Matching results are presented for
simulated XPATCH and real MSTAR SAR target imagery demonstrating the
utility of this approach.

English Descriptors: Synthetic-aperture radar; Automatic recognition; 
Target detection; Matching task; Image processing; Optimization; 
Pattern extraction; Experimental study 

French Descriptors: Radar ouverture synthetique; Reconnaissance automatique 
; Detection cible; Tache appariement; Traitement image; Optimisation; 
Extraction forme; Etude experimentale
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Automated knowledge-based system for stereo video metrology 

MOHAMMED TALEB OBAIDAT; WONG K W 

Civ. Engrg. Dept., Jordan Univ. of Sci. and Technol., (JUST), P.O. Box
3030, Irbid, Jordan

Journal: Journal of surveying engineering, 1996, 122 (2) 47-64 
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354000043212880010 
No. of Refs.: 13 ref.

Document Type: P (Serial); A (Analytic)
Country of Publication: United States
Language: English 

A knowledge-based system has been developed to help inexperienced users
make measurements from stereo video images. The purpose of the system is to
automate much of the routine functions and decision making in
photogrammetric measurements on a personal computer (PC). The system can
perform the following functions: (1) Check the validity of the input data;
(2) warn of weak geometric conditions; (3) provide guidance, diagnostics,
and counseling during success and failure modes; (4) conduct robust blunder
detection; and (5) perform accuracy analysis through error propagation. The
result was the development of a user-friendly vision system that can be
used productively without in-depth knowledge of photogrammetry.
Experimental results showed that the PC-based vision system achieved a
potential accuracy of about one pixel on the image plane for planar
coordinates. Lower measurement accuracy in the range of 4-5 pixels was
obtained for the depth direction because of the intersection geometry and
accuracy limitations in manual image matching. The statistical analysis
scheme, based on random error propagation of the image coordinates, was a
realistic accuracy estimator. Calculated three-dimensional (3D) measurement
errors consistently fell within three times the estimated standard errors
(3 Sigma). Comparison with actual survey measurements showed that
distances could be measured with an accuracy of better than 2 pixels, while
volume and surface area were measured to within 3%. Image scale, base/
object distance ratio, number and distribution of control points,
and accuracy limitation in manual matching had a significant impact on
the measurement accuracy.

English Descriptors: Stereometry; Video technique; Photogrammetry;
Photogrammetric survey; Knowledge base; Image analysis

French Descriptors: Stereometrie; Technique video; Photogrammetrie; Leve 
photogrammetrique; Base connaissance; Analyse image 
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Title: AUTOMATED CORRELATION OF INTRAVASCULAR ULTRASOUND IMAGES WITH 
ANGIOGRAPHY 

Author(s): GOWDA A; GOJER B; MOTAMEDI M; DAVIS MJ; FARRELL RW; RASTEGAR S;
MILLER GE; KRONENBERG MW
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Abstract: One limitation of intravascular ultrasound (IVUS) is the
restriction to viewing one cross-sectional image at a time.
Computerized three-dimensional reconstructions of IVUS images have been
developed in an attempt to overcome this limitation. These algorithms,
however, are limited by artifacts from catheter movements and rotation
within large vessels. Consequently, this technique has been applied
only to straight segments of small caliber vessels. Contrast
angiography has long been the standard for vascular imaging. In order
to take advantage of both contrast angiography and IVUS, we developed a
computer procedure to automatically correlate IVUS images with
their corresponding locations on contrast angiograms, and to display
both images in a side by side format. Models of the aortic arch and
aorto-ileo-femoral system were constructed with artificial plaques
located at various sites. The models were filled with iodinated
contrast media and radiographic images were obtained. Timed pull-backs
were performed in both models in order to obtain sets of serial
cross-sectional images. For each data set, a digitized set of 75
serial IVUS images and model angiographic images were loaded in the
computer procedure. We then correlated at least one IVUS image
containing a known landmark with its position on the model angiogram.
The procedure then automatically displayed sequential ultrasound images
along with their corresponding positions on the reference angiogram.
We analyzed the error of this algorithm as a function of the number
of correlation points used. The maximum error was 4 mm over a
total pullback distance of 130 mm (relative error of 3%). This
algorithm was subsequently used to correlate IVUS images obtained
from the aortic arch of a patient with their corresponding positions on
an aortogram. Our results demonstrate that computer-based correlation
of IVUS images with their corresponding positions on angiograms is
accurate, may enhance the use of IVUS to assess vascular pathology, and
provides an alternative to three-dimensional reconstructions.
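
The abstract does not describe how frame positions are computed from the landmark correlations. One plausible reading, sketched below purely as an assumption, is that each IVUS frame is placed along the angiogram path by linear interpolation between landmark frames, or by a constant pull-back step when only one landmark is known. The function name, the `mm_per_frame` parameter, and the data layout are hypothetical.

```python
def frame_position(frame, landmarks, mm_per_frame=None):
    """Estimated position (mm along the vessel path) of an IVUS frame.

    `landmarks` is a list of (frame_index, position_mm) pairs, i.e. the
    correlation points. With one landmark, a constant pull-back step
    `mm_per_frame` is required; with several, positions are linearly
    interpolated (and extrapolated from the nearest segment).
    """
    points = sorted(landmarks)
    if not points:
        raise ValueError("at least one landmark is required")
    if len(points) == 1:
        if mm_per_frame is None:
            raise ValueError("mm_per_frame is needed with a single landmark")
        f0, p0 = points[0]
        return p0 + (frame - f0) * mm_per_frame
    # Pick the segment containing the frame, or the last one if beyond it.
    for (f0, p0), (f1, p1) in zip(points, points[1:]):
        if frame <= f1:
            break
    return p0 + (frame - f0) * (p1 - p0) / (f1 - f0)


def relative_error_percent(max_error_mm, total_distance_mm):
    """Relative error as a percentage of the total pull-back distance."""
    return 100.0 * max_error_mm / total_distance_mm
```

With the figures reported in the abstract, `relative_error_percent(4, 130)` is about 3.1, consistent with the stated relative error of 3%.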



